In [1]:
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
import geopandas as gpd
import altair as alt
In [2]:
pd.options.display.max_columns = 999
In [3]:
plt.rcParams['figure.figsize'] = (10,6)

Week 9B: OpenStreetMap, Urban Networks, and Interactive Web Maps

Nov 3, 2021

This week: OpenStreetMap (OSM)

Three parts:

  • OSMnx: downloading and manipulating streets as networks
  • Pandana: networks focused on accessibility of amenities, e.g., walking distances to the nearest amenities
  • Related: interactive web maps in Python
In [4]:
import osmnx as ox

Part 3: Interactive maps in Python

Haven't we already done this?

Yes!

We've used hvplot, holoviews, datashader, etc. to create interactive map-based visualizations

Why do we need something more?

The benefits of Leaflet

  • The leading open-source mapping library
  • Simple and powerful
  • Leverage the open-source community and lots of powerful plugins

Folium: Leaflet in Python

Pros

  • Create Leaflet.js maps directly from Python
  • Combine power of Leaflet.js with the data wrangling ease of Python

Cons

  • A wrapper for most, but not all of Leaflet's functionality
  • Can be difficult to debug and find errors

Revisiting routing with OSMnx

OSMnx leverages Folium under the hood to make interactive graphs of street networks!

Key function: ox.plot_graph_folium will make an interactive map of the graph object

Load the street network around City Hall

In [5]:
G = ox.graph_from_address('City Hall, Philadelphia, USA', 
                          dist=1500, 
                          network_type='drive')
In [6]:
# plot the street network with folium
graph_map = ox.plot_graph_folium(G, 
                                 popup_attribute='name', 
                                 edge_width=2)
/Users/nhand/opt/miniconda3/envs/musa-550-fall-2021/lib/python3.8/site-packages/osmnx/folium.py:67: UserWarning: `edge_width` has been deprecated and will be removed: use kwargs instead
  warn("`edge_width` has been deprecated and will be removed: use kwargs instead")
In [7]:
ox.plot_graph_folium?

And now save the map object and load it into the Jupyter notebook

In [8]:
from IPython.display import IFrame # loads HTML files
In [9]:
filepath = 'graph.html'
graph_map.save(filepath)
IFrame(filepath, width=600, height=500)
Out[9]:

Note

Folium map objects are supposed to render automatically in the Jupyter notebook, so if you output a Folium map from a notebook cell, it will render.

However, there a lot of times when the map won't show up properly, especially if the map has a large amount of data. Saving the file locally and loading it to the notebook via an IFrame will always work.

In [10]:
type(graph_map)
Out[10]:
folium.folium.Map
In [11]:
graph_map
Out[11]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Exercise: shortest route between the Liberty Bell and Art Museum

Let's calculate the shortest route between the Art Museum and the Liberty Bell.

See last lecture (Lecture 9A) for a guide!

Step 1: Download amenity info from OSM using OSMnx

Use OSMnx to download all amenities in Philadelphia of type "tourism".

  • The ox.geometries_from_place() can download OSM features with a specific tag
  • Consult the OSM pages (the Art Museum and Liberty Bell) for each feature for additional info
  • Both features are categorized as "tourism" in the OSM data — use the "tags" keyword to select this category
In [ ]:
 

Step 2: Identify the Art Museum and Liberty Bell geometries

You should notice we have the building footprint for the Art Museum (a polygon geometry) and the point location for the Liberty Bell.

  • The names of the features are "Philadelphia Museum of Art" and "Liberty Bell" — you can identify these names using the OSM website
  • The .squeeze() function can be useful for converting to a Series object from a DataFrame of length 1
In [ ]:
 

Step 3: Extract the lat and lng coordinates

For the Art Museum geometry, we can use the .geometry.centroid attribute to calculate the centroid of the building footprint.

In [ ]:
 

Step 4: Find the nearest nodes on our OSMnx graph

Use the street network graph around City Hall and the ox.get_nearest_node() function to find the starting and ending nodes for the trip.

In [ ]:
 

Step 5: Use networkx to find the shortest path between nodes

The nx.shortest_path() will do the calculation for you!

In [23]:
import networkx as nx
In [ ]:
 

Example: interactive maps of network routes

Now, we can overlay the shortest route between two nodes on the folium map.

Key function: use ox.plot_route_folium to plot the route.

In [25]:
# plot the route with folium
route_map = ox.plot_route_folium(G, route)
In [26]:
filepath = 'route.html'
route_map.save(filepath)
IFrame(filepath, width=600, height=500)
Out[26]:

We can also add the underlying street network graph

In [27]:
# plot the route with folium on top of the previously created graph_map
route_graph_map = ox.plot_route_folium(G, route, route_map=graph_map)
In [28]:
# save as html file then display map as an iframe
filepath = 'route_graph.html'
route_graph_map.save(filepath)
IFrame(filepath, width=600, height=500)
Out[28]:

Note the Leaflet annotation in the bottom right corner of the maps...

Getting started with Folium

Things we'll cover:

  1. Creating a base map with tiles
  2. Overlaying GeoJSON polygons
  3. Plotting an interactive choropleth
In [29]:
import folium

3.1 Creating a Folium map

Key function: folium.Map

Lots of configuration options

Some key ones:

  • location: the center location of the map
  • zoom_start: the initial zoom level of the map
  • tiles: the name of the tile provider

Let's take a look at the help message:

In [30]:
folium.Map?

The default tiles: OpenStreetMap

In [31]:
# let's center the map on Philadelphia
m = folium.Map(
    location=[39.99, -75.13],
    zoom_start=11
)

m
Out[31]:
Make this Notebook Trusted to load map: File -> Trust Notebook
In [32]:
m = folium.Map(
    location=[39.99, -75.13],
    zoom_start=11,
     tiles='stamenwatercolor'
)

m
Out[32]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Using custom tile sets

Let's try out the ESRI World Map:

Important: for custom tile providers, you need to specify the attribution too!

In [33]:
tile_url = 'https://server.arcgisonline.com/ArcGIS/rest/services/World_Street_Map/MapServer/tile/{z}/{y}/{x}'
attr = 'Tiles © Esri — Source: Esri, DeLorme, NAVTEQ, USGS, Intermap, iPC, NRCAN, Esri Japan, METI, Esri China (Hong Kong), Esri (Thailand), TomTom, 2012'
In [34]:
m = folium.Map(
    location=[39.99, -75.13],
    zoom_start=11,
    tiles=tile_url,
    attr=attr
)

m
Out[34]:
Make this Notebook Trusted to load map: File -> Trust Notebook

3.2 Overlaying GeoJSON on a folium map

Key function: folium.GeoJson

Key parameters:

  • style_function: set the default style of the features
  • highlight_function: set the style when the mouse hovers over the features
  • tooltip: add a tooltip when hovering over a feature
In [35]:
folium.GeoJson?
In [36]:
folium.GeoJsonTooltip?

Example: Philadelphia ZIP codes & Neighborhoods

In [37]:
# Load neighborhoods from GitHub
url = "https://github.com/azavea/geo-data/raw/master/Neighborhoods_Philadelphia/Neighborhoods_Philadelphia.geojson"
hoods = gpd.read_file(url).rename(columns={"mapname": "neighborhood"})
In [38]:
# Load ZIP codes from Open Data Philly
zip_url = "http://data.phl.opendata.arcgis.com/datasets/b54ec5210cee41c3a884c9086f7af1be_0.geojson"
zip_codes = gpd.read_file(zip_url).rename(columns={"CODE":"ZIP Code"})
In [39]:
ax = ox.project_gdf(hoods).plot(fc="lightblue", ec="gray")
ax.set_axis_off()
In [40]:
ax = ox.project_gdf(zip_codes).plot(fc="lightblue", ec="gray")
ax.set_axis_off()

Define functions to set the styles:

In [41]:
def get_zip_code_style(feature):
    """Return a style dict."""
    return {"weight": 2, "color": "white"}


def get_neighborhood_style(feature):
    """Return a style dict."""
    return {"weight": 2, "color": "lightblue", "fillOpacity": 0.1}


def get_highlighted_style(feature):
    """Return a style dict when highlighting a feature."""
    return {"weight": 2, "color": "red"}

Usual Leaflet/Folium syntax

  1. Create the map
  2. Create your overlay layer
  3. Add your overlay layer to your map
In [42]:
# Create the map
m = folium.Map(
    location=[39.99, -75.13],
    tiles='Cartodb dark_matter',
    zoom_start=11
)

# Add the ZIP Codes GeoJson to the map
folium.GeoJson(
    zip_codes.to_crs(epsg=4326).to_json(), # IMPORTANT: make sure CRS is lat/lng (EPSG=4326)
    name='Philadelphia ZIP_codes',
    style_function=get_zip_code_style,
    highlight_function=get_highlighted_style,
    tooltip=folium.GeoJsonTooltip(['ZIP Code'])
).add_to(m)


# Add a SECOND layer for neighborhoods
folium.GeoJson(
    hoods.to_crs(epsg=4326).to_json(), # IMPORTANT: make sure CRS is lat/lng (EPSG=4326)
    name='Neighborhoods',
    style_function=get_neighborhood_style,
    highlight_function=get_highlighted_style,
    tooltip=folium.GeoJsonTooltip(['neighborhood'])
).add_to(m)


# Also add option to toggle layers
folium.LayerControl().add_to(m)

m
Out[42]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Important notes:

  • The data should be passed as GeoJSON rather than a GeoDataFrame — you need to call .to_json()
  • I've added a LayerControl to toggle different layers on the map
  • I've specified a tooltip using folium.GeoJsonTooltip

3.3 Plotting a choropleth map

Overlay GeoJSON features on an interactive map, colored by a specific data variable

At-home exercise: load data for internet availability in US counties

In the interest of time, I've used cenpy to download data for internet availability from the 2019 5-year ACS, but a good at-home exercise is to try to replicate my work.

In [51]:
census_data = pd.read_csv("./data/internet_avail_census.csv", dtype={"geoid": str})
In [52]:
# Remove counties with no households
valid = census_data['universe'] > 0
census_data = census_data.loc[valid]

# Calculate the percent without internet
census_data['percent_no_internet'] = census_data['no_internet'] / census_data['universe']
In [53]:
census_data.head()
Out[53]:
NAME universe no_internet state county geoid percent_no_internet
0 Washington County, Mississippi 18299.0 6166.0 28 151 28151 0.336958
1 Perry County, Mississippi 4563.0 1415.0 28 111 28111 0.310103
2 Choctaw County, Mississippi 3164.0 1167.0 28 19 28019 0.368837
3 Itawamba County, Mississippi 8706.0 1970.0 28 57 28057 0.226281
4 Carroll County, Mississippi 3658.0 1218.0 28 15 28015 0.332969

Load counties from the data folder as well:

In [54]:
counties = gpd.read_file("./data/us-counties-10m.geojson")
In [55]:
counties.head()
Out[55]:
id geometry
0 53073 MULTIPOLYGON (((-120.85361 49.00011, -120.7674...
1 30105 POLYGON ((-106.11238 48.99904, -106.15187 48.8...
2 30029 POLYGON ((-114.06985 48.99904, -114.05908 48.8...
3 16021 POLYGON ((-116.04755 49.00065, -116.04755 48.5...
4 30071 POLYGON ((-107.17840 49.00011, -107.20712 48.9...

Now let's make the choropleth..

The easy way: use folium.Choropleth

  • The good:
    • Automatically generate a choropleth from a set of features and corresponding pandas DataFrame
    • Automatic creation of a legend
  • The bad:
    • no tooltip and little highlight interactivity (currently being worked on)
In [56]:
folium.Choropleth?

Steps:

  1. Pass the geometry data (counties) as GeoJSON in lat/lng CRS
  2. Pass in the census data separately
  3. Pass in the column that we match the geometries on (key_on=), using the GeoJSON "features.properties." syntax
  4. Pass the data key (column to match the data on) and the data value (the column to color the geometries by) via the columns= keyword
In [58]:
m = folium.Map(location=[40, -98], zoom_start=4)

# Convert the counties geometries into GeoJSON
counties_geojson = counties.to_crs(epsg=4326).to_json()

folium.Choropleth(
    geo_data=counties_geojson, # Pass in GeoJSON data for counties
    data=census_data, # the census data
    columns=["geoid", 'percent_no_internet'], # First column must be the key, second the values
    key_on="feature.properties.id", # Key to match on in the geometries
    fill_color='RdPu', # any ColorBrewer name will work here
    fill_opacity=0.7,
    line_opacity=1,
    line_weight=0.5,
    legend_name='Households without Internet (%)',
    name='choropleth',
).add_to(m)


m
Out[58]:
Make this Notebook Trusted to load map: File -> Trust Notebook

The hard way: use folium.GeoJson

  • The good:
    • more customizable, and can add user interaction
  • The bad:

The steps involved

  1. Join data and features
  2. Define a function to style features based on data values
  3. Create GeoJSON layer and add it to the map

Step 1: Join the census data and features

In [59]:
census_joined = pd.merge(
    counties[["id", "geometry"]], census_data, left_on="id", right_on="geoid"
)
In [60]:
census_joined.head()
Out[60]:
id geometry NAME universe no_internet state county geoid percent_no_internet
0 53073 MULTIPOLYGON (((-120.85361 49.00011, -120.7674... Whatcom County, Washington 85008.0 9189.0 53 73 53073 0.108096
1 30105 POLYGON ((-106.11238 48.99904, -106.15187 48.8... Valley County, Montana 3436.0 672.0 30 105 30105 0.195576
2 30029 POLYGON ((-114.06985 48.99904, -114.05908 48.8... Flathead County, Montana 38252.0 5662.0 30 29 30029 0.148018
3 16021 POLYGON ((-116.04755 49.00065, -116.04755 48.5... Boundary County, Idaho 4605.0 1004.0 16 21 16021 0.218024
4 30071 POLYGON ((-107.17840 49.00011, -107.20712 48.9... Phillips County, Montana 1770.0 484.0 30 71 30071 0.273446

Step 2: Normalize the data column to 0 to 1

We will use a matplotlib colorbar and it requires data to be between 0 and 1

In [61]:
x = census_joined['percent_no_internet']
census_joined['percent_no_internet_normalized'] = (x - x.min()) / (x.max() - x.min())

Step 3: Define a style function

  • create a matplotlib colorbar
  • based on the feature data, evaluate the colorbar and convert to a hex string
In [62]:
import matplotlib.colors as mcolors
In [63]:
# use a red-purple colorbrewer color scheme
cmap = plt.get_cmap('RdPu')
In [64]:
cmap(0.5)
Out[64]:
(0.9670280661284122, 0.40464436755094196, 0.6307574009996155, 1.0)
In [65]:
mcolors.rgb2hex(cmap(0))
Out[65]:
'#fff7f3'
In [66]:
mcolors.rgb2hex(cmap(1.0))
Out[66]:
'#49006a'
In [67]:
def get_style(feature):
    # get the data value from the feature
    value = feature['properties']['percent_no_internet_normalized']
    
    # evaluate the color map
    # NOTE: value must between 0 and 1
    rgb_color = cmap(value) # this is a RGB tuple
    
    # convert to hex string
    color = mcolors.rgb2hex(rgb_color)
    
    # return the style dictionary
    return {'weight': 0.25, 'color': color, 'fillColor': color, "fillOpacity": 0.75}

Step 4: Convert our data to GeoJSON

Be sure to only keep the data columns we'll actually use!

In [68]:
needed_cols = ['NAME', 'percent_no_internet', 'percent_no_internet_normalized', 'geometry']
census_json = census_joined[needed_cols].to_json()
In [69]:
# initialize the map
m = folium.Map(location=[40, -98], zoom_start=4)

# add the GeoJson to the map
folium.GeoJson(
    census_json,
    name='choropleth',
    style_function=get_style,
    highlight_function=get_highlighted_style,
    tooltip=folium.GeoJsonTooltip(['NAME', 'percent_no_internet'])
).add_to(m)


folium.LayerControl().add_to(m)

# avoid a rendering bug by saving as HTML and re-loading
m.save('percent_no_internet.html')

And viola!

The hard way is harder, but we have a tooltip and highlight interactivity!

In [70]:
IFrame('percent_no_internet.html', width=800, height=500)
Out[70]:

At-home exercise: Can we repeat this with altair?

Try to replicate the above interactive map exactly (minus the background tiles). This includes:

  • Using the blue purple colorbrewer scheme
  • Having a tooltip with the percentage and county name

Altair's syntax is similar to the folium.Choropleth syntax — you should pass the counties GeoDataFrame to the alt.Chart() and then use the transform_lookup() to merge those geometries to the census data and pull in the census data column we need ("percent_without_internet").

Hints

In [ ]:
 

That's it!

  • Next week:
    • One final item for Folium: plugins
    • Clustering and the start of machine learning
  • See you on Monday!
In [ ]: